FIG. 2



The Greedy CaRT Selection Algorithm

procedure Greedy(T(X), e, G, θ)

Input: n-attribute table T and n-vector of error tolerances e;
       Bayesian network G on the set of attributes X and
       threshold θ on the relative benefit for selecting a
       CaRT predictor.

Output: A set of materialized (predicted) attributes X_mat (X_pred
        = X − X_mat) and a CaRT predictor for each X_i ∈ X_pred.

begin
1.  X_mat := X_pred := ∅
2.  let <X_1, X_2, ..., X_n> be the attributes in X sorted in
        topological order of G
3.  for i := 1, ..., n
4.      if Π(X_i) = ∅ then X_mat := X_mat ∪ {X_i}   /* X_i must be
            materialized if it has no parents in G */
5.      else
6.          M := BuildCaRT(X_mat → X_i, e_i)
7.          if (MaterCost(X_i) / PredCost(X_mat → X_i) > θ) then
                X_pred := X_pred ∪ {X_i}
8.          else X_mat := X_mat ∪ {X_i}
9.      end
10. end
end
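
The selection loop of FIG. 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the figure's implementation: greedy_select, mater_cost, and pred_cost are hypothetical names, the two cost callables stand in for MaterCost and for BuildCaRT followed by PredCost, and the toy network and costs are invented for the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def greedy_select(parents, mater_cost, pred_cost, theta):
    """Sketch of the Greedy loop.

    parents    : dict mapping each attribute to the set of its parents in G
    mater_cost : hypothetical callable, attribute -> storage cost
    pred_cost  : hypothetical callable, (predictor tuple, attribute) -> cost
                 of a CaRT built from those predictors (stands in for
                 BuildCaRT followed by PredCost); assumed to be positive
    theta      : relative-benefit threshold
    """
    materialized = []   # X_mat, kept in topological order
    predicted = {}      # X_pred: attribute -> candidate predictors offered
    for attr in TopologicalSorter(parents).static_order():      # step 2
        if not parents.get(attr):
            materialized.append(attr)        # step 4: no parents in G
            continue
        candidates = tuple(materialized)     # step 6: X_mat -> X_i
        if mater_cost(attr) / pred_cost(candidates, attr) > theta:  # step 7
            predicted[attr] = candidates
        else:
            materialized.append(attr)        # step 8
    return materialized, predicted


# Toy network A -> B -> C with invented costs (assumptions for illustration).
toy_parents = {"A": set(), "B": {"A"}, "C": {"B"}}
toy_pred_costs = {(("A",), "B"): 2.0, (("A",), "C"): 8.0}
greedy_mat, greedy_pred = greedy_select(
    toy_parents,
    mater_cost=lambda attr: 10.0,
    pred_cost=lambda preds, attr: toy_pred_costs[(preds, attr)],
    theta=2.0,
)
print(greedy_mat, greedy_pred)
```

In this toy run B is predicted because 10 / 2 exceeds θ = 2, while C stays materialized: by the time C is visited only A is in X_mat, and 10 / 8 does not exceed θ.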
FIG. 3A
FIG. 4

The MaxIndependentSet CaRT Selection Algorithm

procedure MaxIndependentSet(T(X), e, G, neighborhood())

Input: n-attribute table T and n-vector of error tolerances e;
       Bayesian network G on the set of attributes X and function
       neighborhood() defining the "predictive neighborhood" of a
       node X_i in G (e.g., Π(X_i) or β(X_i)).

Output: A set of materialized (predicted) attributes X_mat (X_pred = X −
        X_mat) and a CaRT predictor PRED(X_i) → X_i for each X_i ∈ X_pred.

begin
1.  X_mat := X, X_pred := ∅
2.  PRED(X_i) := ∅ for all X_i ∈ X, improve := true
3.  while (improve ≠ false) do
4.      for each X_i ∈ X_mat
5.          mater_neighbors(X_i) :=
                (X_mat ∩ neighborhood(X_i)) ∪ {PRED(X) : X ∈ neighborhood(X_i),
                X ∈ X_pred} − {X_i}
6.          M := BuildCaRT(mater_neighbors(X_i) → X_i, e_i)
7.          let PRED(X_i) ⊆ mater_neighbors(X_i) be the set of
                predictor attributes used in M
8.          cost_change_i := 0
9.          for each X_j ∈ X_pred such that X_i ∈ PRED(X_j)
10.             NEW_PRED_i(X_j) := PRED(X_j) − {X_i} ∪ PRED(X_i)
11.             M := BuildCaRT(NEW_PRED_i(X_j) → X_j, e_j)
12.             set NEW_PRED_i(X_j) to the (sub)set of
                    predictor attributes used in M
13.             cost_change_i := cost_change_i + (PredCost(PRED(X_j)
                    → X_j) − PredCost(NEW_PRED_i(X_j) → X_j))
14.         end
15.     end



FIG. 4 (cont)
16.     build an undirected, node-weighted graph G_temp = (X_mat,
            E_temp) on the current set of materialized
17.         attributes X_mat, where:

18.         (a) E_temp := {(X, Y) : for all pairs X, Y ∈ PRED(X_j) for some X_j ∈ X_pred} ∪
19.             {(X_i, Y) : for all Y ∈ PRED(X_i), X_i ∈ X_mat}

20.         (b) weight(X_i) := MaterCost(X_i) − PredCost(PRED(X_i)
                → X_i) + cost_change_i for each X_i ∈ X_mat
21.     S := FindWMIS(G_temp)   /* select (approximate) maximum
            weight independent set in G_temp
22.         as "maximum-benefit" subset of
            predicted attributes */
23.     if (Σ_{X∈S} weight(X) < 0) then improve := false
24.     else   /* update X_mat, X_pred, and the chosen CaRT predictors */
25.         for each X_j ∈ X_pred
26.             if (PRED(X_j) ∩ S = {X_i}) then PRED(X_j) :=
                    NEW_PRED_i(X_j)
27.         end
28.         X_mat := X_mat − S, X_pred := X_pred ∪ S
29.     end
30. end /* while */
end
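
FIG. 4 calls FindWMIS only as an (approximate) maximum-weight independent set routine and does not say how it works. The sketch below substitutes one simple greedy heuristic: repeatedly pick the node with the largest weight divided by the size of its remaining closed neighborhood, then discard its neighbors. This is an assumption made for illustration, not the procedure the figure relies on; the function name find_wmis_greedy, the choice to skip nodes with non-positive weight, and the toy weights and edges are all hypothetical.

```python
def find_wmis_greedy(weights, edges):
    """Greedy heuristic for a weighted independent set (illustrative only).

    weights : dict node -> weight(X_i), as computed in step 20
    edges   : iterable of (u, v) pairs, the edge set E_temp
    """
    adj = {node: set() for node in weights}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Assumption: a node with non-positive weight cannot raise the total
    # benefit, so it is never selected.
    remaining = {node for node, w in weights.items() if w > 0}
    chosen = set()
    while remaining:
        # sorted() makes tie-breaking deterministic.
        best = max(
            sorted(remaining),
            key=lambda node: weights[node] / (len(adj[node] & remaining) + 1),
        )
        chosen.add(best)
        remaining -= adj[best] | {best}   # neighbors can no longer be chosen
    return chosen


# Toy graph with invented weights: edges B-C and C-D, A isolated.
toy_weights = {"A": -3.0, "B": 6.0, "C": 5.0, "D": 4.0}
toy_edges = [("B", "C"), ("C", "D")]
selected = find_wmis_greedy(toy_weights, toy_edges)       # the set S
benefit = sum(toy_weights[node] for node in selected)     # sum tested in step 23
new_x_mat = set(toy_weights) - selected                   # step 28: X_mat − S
print(sorted(selected), benefit, sorted(new_x_mat))
```

On this toy input the heuristic selects B and D (total weight 10), leaving A and C materialized. Step 23 compares that total against 0 to decide whether another iteration is attempted.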
FIG. 5

Algorithm for Estimating Lower Bound on Subtree Cost

procedure LowerBound(N, e_i, b)

Input: Leaf N for which lower bound on subtree cost is to be
       computed; error tolerance e_i for attribute X_i; bound b
       on the maximum number of internal nodes in subtree
       rooted at N.

Output: Lower bound L(N) on cost of subtree rooted at N.

begin
1.  for i := 0 to r
2.      minOut[i, 0] := i
3.  for j := 1 to b + 1
4.      minOut[0, j] := 0
5.  l := 0
6.  for i := 1 to r
7.      while x_i − x_{l+1} > 2e_i
8.          l := l + 1
9.      for j := 1 to b + 1
10.         minOut[i, j] := min{minOut[i − 1, j] + 1, minOut[l, j − 1]}
11. end
12. L(N) := ∞
13. for j := 0 to b
14.     L(N) := min{L(N), 2j + 1 + j log(|X_i|) + (j + 1 + minOut[r, j + 1])
            log(|dom(X_i)|)}
15. L(N) := min{L(N), 2b + 3 + (b + 1) log(|X_i|) + (b + 2) log
        (|dom(X_i)|)}
16. return L(N)
end
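
The dynamic program of FIG. 5 can be sketched in Python as shown below. The sketch rests on several assumptions that the figure does not state explicitly: x_1, ..., x_r are taken to be the sorted values of the target attribute in the records at leaf N, log is taken base 2, |X_i| is read as the number of candidate predictor attributes, and minOut[i, j] is read as the fewest outliers among the first i values when j intervals of width 2e_i are available. The names lower_bound, num_predictors, and domain_size are hypothetical.

```python
import math


def lower_bound(values, tol, b, num_predictors, domain_size):
    """Sketch of LowerBound for one leaf.

    values         : target-attribute values of the records at leaf N
    tol            : error tolerance e_i
    b              : bound on the number of internal nodes in the subtree
    num_predictors : assumed meaning of |X_i| in the cost formula
    domain_size    : |dom(X_i)|
    """
    xs = sorted(values)            # x_1 <= ... <= x_r (1-based in the figure)
    r = len(xs)
    min_out = [[0] * (b + 2) for _ in range(r + 1)]   # steps 3-4: row 0 is 0
    for i in range(r + 1):
        min_out[i][0] = i          # steps 1-2: no interval, all i are outliers
    l = 0                          # step 5
    for i in range(1, r + 1):      # step 6
        # xs[l] is x_{l+1} in the figure's 1-based notation (step 7).
        while xs[i - 1] - xs[l] > 2 * tol:
            l += 1                 # step 8
        for j in range(1, b + 2):  # steps 9-10
            min_out[i][j] = min(min_out[i - 1][j] + 1, min_out[l][j - 1])
    log_attrs = math.log2(num_predictors)
    log_dom = math.log2(domain_size)
    best = math.inf                # step 12
    for j in range(b + 1):         # steps 13-14: subtree with j internal nodes
        cost = 2 * j + 1 + j * log_attrs + (j + 1 + min_out[r][j + 1]) * log_dom
        best = min(best, cost)
    # Step 15: any subtree with more than b internal nodes.
    best = min(best, 2 * b + 3 + (b + 1) * log_attrs + (b + 2) * log_dom)
    return best


# Invented example: six values, tolerance 1, at most one internal node.
bound = lower_bound([1, 2, 3, 10, 11, 20], tol=1, b=1,
                    num_predictors=4, domain_size=32)
print(bound)
```

For this input one interval covers {1, 2, 3} and leaves three outliers, while two intervals leave one outlier, so the candidates are 21 (no split), 20 (one split), and 24 (more than one split), and the function returns 20.0.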
FIG. 7A